Computer system and a method for controlling a computer system

ABSTRACT

In a cluster system including a plurality of operating systems operating on one computer, computer resources can be updated for and reallocated to each operating system. When the operating systems are used as active or standby operating systems, a multiple operating system management controller monitors the state of each operating system. At a failure of an active operating system, the controller allocates a larger part of computer resources to another operating system in a normal state and assigns the operating system as a new active operating system. Regardless of the failure, the computer system can be operated without changing processing capability thereof. The controller can monitor load of each operating system to allocate computer resources to the operating system according to the load.

BACKGROUND OF THE INVENTION

The present invention relates to a computer system and a method for controlling a computer system in which a plurality of operating systems (OS) operate on one computer, and in particular, to a computer system and a method for controlling a computer system in which computer resources can be efficiently used or operated in an operating state of a plurality of operating systems.

To improve reliability of computers, JP-A-11-353292 describes a control technique of load distribution of the background art. The technique is called a “cluster system”, in which, when an operating system running on a computer fails, another operating system receives processing of the failed operating system to continue the processing. In the system of the background art, at a failure of an operating system on a computer, programs running on the computer can be continuously executed.

To operate a plurality of operating systems in the cluster system which is a computer system of the background art, an equal number of computers and operating systems are required. The cluster system of the background art includes active computers and standby computers, and the active and standby computers operate independently of each other. In this connection, an active computer is a computer ordinarily operating, and a standby computer is a computer arranged to take over, when the active computer fails, processing of the failed computer to thereby continue the processing.

Therefore, in the cluster system of the background art, when a failure occurs on an active computer, processing thereof is passed to a standby computer. However, when a process is running on the standby computer, the entire processing capability of the active computer cannot be achieved by the standby computer. In the background art system, while the standby system is kept stopped to take over the processing of the active computer, the computer resources of the standby computer cannot be efficiently used.

In a cluster system including, for example, two computers, one computer is an active system and the other computer is a standby system. At occurrence of a failure in the active computer, when it is necessary that the standby computer achieve the entire processing capability of the failed active computer, the standby computer must have processing capability at least equal to that of the failed computer. Moreover, at the failure in the active computer, the standby computer must stop processing being executed on the standby computer and then take over processing from the failed computer to execute the processing. To execute the processing taken from the failed computer without stopping its own processing, the standby computer must have processing capability higher than that of the failed computer.

Consequently, the cluster system of the background art has a problem in that, to achieve the processing capability of one computer of the ordinary active system, two computers are required, each of which has processing capability equal to or more than that of the computer of the active system.

SUMMARY OF THE INVENTION

It is therefore an object of the present invention, which has been devised to remove the problems, to provide a computer system and a method for controlling a computer system wherein a cluster system is configured to improve computer reliability and the system can be effectively operated by more efficiently using computer resources.

In accordance with the present invention, the object can be achieved by a method in which a plurality of operating systems are operated in one computer system and computer resources are updated and reallocated to efficiently use the computer resources according to states of operating systems in the computer system. For example, a cluster system is configured in one computer system including an operating system of an active system and another operating system of a standby system. A larger part of the computer resources is allocated to the active operating system. When a failure occurs in the active operating system, the standby operating system takes over processing of the failed operating system. The larger part of the computer resources is then allocated to the standby operating system, which starts operation as a new active operating system. This leads to more efficient use of the computer resources.

The object can be achieved by an operation in which the system monitors not only failures of operating systems, but also various states using an environment of update and reallocation of the computer resources to efficiently use the computer resources according to the states thus monitored. For example, when the amount of processing assigned to an operating system increases, load on the operating system becomes higher. In this situation, a larger number of computer resources are assigned to the operating system to increase processing capability of the operating system. This relatively decreases the load imposed on the operating system. When load is increased in all operating systems, standby computer resources such as standby central processing units (CPU) and standby memories are added to the system to increase system processing capability. After the processing is terminated, the added computer resources are released. Therefore, the computer resources can be efficiently used.

The object can be achieved by a method in which, when users among a plurality of users accessing a computer system capable of reallocating computer resources desire a larger part of computer resources in a particular state, each user registers his or her request to a management section of the computer system. After the desired computer resources are used, the user is charged according to a ratio of the used computer resources. The computer system can therefore be deployed as a server for a plurality of users. It is possible to provide the computer system with flexible usability including an account system or a charging system.

The object can be achieved by a system including a multiple operating system management controller which operates a plurality of operating systems in one computer system. The multiple operating system management controller manages all operating systems and provides a shared memory for the operating systems. Using the shared memory, the operating systems can communicate with each other under supervision of the multiple operating system management controller without using any external input/output device. Using the communicating function, the controller generates a communication protocol equivalent to a protocol used by a network to thereby support communication with the shared memory. That is, the system can implement a virtual external input/output device on the main memory without using any particular protocol.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention will be more apparent from the following detailed description, when taken in conjunction with the accompanying drawings, in which:

FIG. 1 is a block diagram showing constitution of a first embodiment of a computer system of the present invention;

FIG. 2 is a block diagram schematically showing a system construction including a cluster configuration of operating systems running in a computer system;

FIG. 3 is a diagram showing tables to control a computer state;

FIG. 4 is a flowchart showing operation to update computer resources at a failure in an operating system;

FIG. 5 is a diagram showing tables necessary for a second embodiment of the present invention;

FIG. 6 is a diagram for explaining a computer resource update table to update computer resource ratios according to load on operating systems;

FIG. 7 is a diagram for explaining a computer resource update state table containing load on computer resources;

FIG. 8 is a flowchart showing an operation procedure for each operating system to update computer resources;

FIG. 9 is a diagram for explaining a layout of a computer resource update table for a batch processing system;

FIG. 10 is a flowchart for explaining operation to update computer resources of a batch processing system;

FIG. 11 is a diagram for explaining constitution of a user configuration area including a user account table and a user register table necessary for a third embodiment of the present invention;

FIG. 12 is a diagram for explaining a layout of a computer resource account standard area including account standard tables necessary for accounting calculation;

FIG. 13 is a flowchart for explaining an accounting procedure according to use of computer resources;

FIG. 14 is a block diagram schematically showing a fourth embodiment of a computer system in which operating systems are configured in a cluster layout;

FIG. 15 is a flowchart for explaining a transmission procedure for communication with a shared memory; and

FIG. 16 is a flowchart for explaining a reception procedure for communication with a shared memory.

DESCRIPTION OF THE EMBODIMENTS

Referring now to the accompanying drawings, description will be given in detail of an embodiment of a computer system and an embodiment of a method of controlling a computer system in accordance with the present invention.

FIG. 1 is a block diagram showing the structure of a first embodiment of a computer system. FIG. 2 is a block diagram showing a schematic configuration of a computer system including operating systems in a cluster layout. FIG. 3 shows tables to control states of a computer. FIG. 4 shows, in a flowchart, operation to update computer resources at a failure of an operating system. In FIG. 1, a computer system 100 includes a processor group 110, a main memory 120, an external input/output device group 130, a multiple operating system (multi OS) management control supporter 140, and a bus 150 establishing connections between the constituent components. In FIG. 2, a multiple operating system management controller 200 controls first and second operating systems 201 and 202, cluster services 203 and 204, monitor agents 205 and 206, applications 207 and 208, first and second operating system CPU groups 221 and 222, and terminals 230 to 232.

The computer system 100 of the first embodiment includes a processor group 110, a main memory 120, an external input/output device group 130, a multiple operating system management control supporter 140 to operate a plurality of operating systems in one computer system, and a bus 150 connecting the constituent components to each other as shown in FIG. 1. To enable one computer system to execute a plurality of operating systems, the supporter 140 intervenes between the processor group 110, the main memory 120, and the external device group 130 of the computer system 100. The processor group 110 is a set of one or more processors (CPUs) and includes processors 11a, 11b, ..., 11n. The main memory 120 includes main memory areas 12a to 12n independent of each other. The areas 12a to 12n are allocated respectively to operating systems, i.e., operating system 1 (OS-1) to operating system n (OS-n). The main memory 120 further includes a main storage area 121 for the multiple operating system management controller and an empty area 122 to be reserved. The external device group 130 includes an input keyboard 131, an output display 132, external storages 133 to 135 for respective operating systems, communicating devices 136 and 137, and other devices, not shown.

In the example of FIG. 2 schematically showing operating systems running in a computer system configured as shown in FIG. 1, the first and second operating systems OS-1 and OS-2 are arranged in a cluster layout. The external storage 133 is allocated to OS-1, the external storage 134 is allocated to OS-2, the disk device 135 is shared between OS-1 and OS-2, the communication devices 136 and 137 are allocated to OS-1, and the communication devices 138 and 139 are allocated to OS-2. The configuration further includes a cable 211 to connect the communication device 136 of OS-1 to the communication device 139 of OS-2 and a cable 212 to connect the communication device 137 of OS-1 to the communication device 138 of OS-2. The cable 211 is a cable to establish external connection to other computers. The cable 212 is a cable dedicated to cluster connection, namely, connection between OS-1 and OS-2. FIG. 2 shows two kinds of cables. This is only to improve reliability of the cluster, and the cluster layout can be constructed without using the cable 212. The CPU group 221 is allocated to OS-1 and the CPU group 222 is allocated to OS-2.

The multiple operating system management controller 200 is allocated to the multiple operating system management controller area 121 of the main memory 120 in FIG. 1. The controller 200 updates computer resources, namely, the processor group 110, the main memory 120, and the external input/output device group 130. Any terminals 230 to 232 to be connected to clusters are connected to the cable 211. One of the terminals, for example, the terminal 230, may serve as a manager terminal which controls states of OS-1 and OS-2 and notifies, according to the states, update information of computer resources via the cable 211 and each operating system.

The first and second operating systems 201 and 202 are respectively allocated to the OS-1 area 12a and the OS-2 area 12b of the main memory 120. The cluster services 203 and 204, the monitor agents 205 and 206, and applications 207 and 208 for the respective operating systems are respectively stored in the areas 12a and 12b. Each of the monitor agents 205 and 206 is a program which analyzes computer system states and which requests update and/or allocation of computer resources to the multiple operating system management controller according to the computer system states to thereby efficiently use the computer resources.

The CPU groups 221 and 222 for the first and second operating systems are allocated in either one of two methods. First, CPU-1 and CPU-2 of the CPU group 110 are allocated to the first operating system and are collectively called a CPU group 221. CPU-3 and CPU-4 of the CPU group 110 are allocated to the second operating system and are collectively called a CPU group 222. Each CPU is operated only by the associated operating system. Second, CPU-1 and CPU-2 are allocated to both CPU groups 221 and 222. Each CPU is allocated to another operating system at a predetermined interval of time. Either method may be used to allocate the CPU groups to the operating systems according to the present invention. In accordance with the present invention, the computer resources such as the CPU and the memory are updated to be allocated to a plurality of operating systems of the cluster system according to the states of the operating systems. For example, when the cluster system includes two operating systems in which a first operating system operates as an active system and a second operating system operates as a standby system, a larger number of computer resources are allocated to the active operating system. The allocation of computer resources to the operating systems is managed using a computer resource management table or computer state table 300.

The table 300 to manage allocation of computer resources to OS-1 and OS-2 in the cluster system is configured as shown in FIG. 3 and is stored in an area, not shown, of the main memory 120. The multiple operating system management control supporter 140 manages the table 300. This is also the case with all tables which will be described later. The table 300 includes a computer resource allocation table 310 and an operating system state table 320. In the supporter 140 managing the table 300, a processing executing section is disposed for each resource. The processing executing section recognizes the computer states and the computer resource state according to values in the table 300 and updates allocation of the computer resource. The computer resource allocation table 310 includes entries of which each includes a computer resource name field 311, a computer resource service ratio field 312 for the active computer, and a computer resource service ratio field 313 for the standby computer. This provides a relationship between the computer resource and a role of the computer in the cluster system. Each entry of the operating system state table 320 includes an operating system name field 321, a state information field 322 of each operating system, an information field 323 to indicate that the operating system is an active or standby system, and a service ratio field 324 indicating a service ratio of resource (CPU) for the operating system. The table 320 provides the state of operation of each operating system.

The computer resource management table or the computer state table 300 having a structure as shown in FIG. 3 is updated by the processing executing sections which manage the tables 310 and 320 according to the cluster configuration and the allocation of roles of the processing executing sections of the respective resources. In the table 300 shown in FIG. 3, 95% of the CPU resources are allocated to the active OS-1 and 5% thereof are allocated to the standby OS-2. Although FIG. 3 does not show allocation of areas of the main memory, a larger quantity of the main memory is allocated to the active OS-1. The main memory allocation may be conducted according to a range of addresses, not by the service ratio. Although the resource name fields of the computer resource allocation table 310 are “CPU” and “main memory”, the number of external storages reserved for use may also be registered to the table 310.
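The layout of the tables 310 and 320 can be illustrated with a minimal Python sketch. The class names, field names, the state string "normal", and the helper `ratio_for` are assumptions introduced only for illustration; the 95%/5% CPU split is the value given for FIG. 3, and no main memory row is shown because FIG. 3 does not give one.

```python
from dataclasses import dataclass


@dataclass
class AllocationEntry:
    """One entry of the computer resource allocation table 310 (names are hypothetical)."""
    resource: str        # computer resource name field 311
    active_ratio: int    # service ratio for the active system, field 312 (percent)
    standby_ratio: int   # service ratio for the standby system, field 313 (percent)


@dataclass
class OsStateEntry:
    """One entry of the operating system state table 320 (names are hypothetical)."""
    name: str            # operating system name field 321
    state: str           # state information field 322
    role: str            # active or standby, field 323
    cpu_ratio: int       # CPU service ratio field 324 (percent)


# Values taken from the FIG. 3 description: 95% to active OS-1, 5% to standby OS-2.
allocation_table = [AllocationEntry("CPU", 95, 5)]
os_state_table = [
    OsStateEntry("OS-1", "normal", "active", 95),
    OsStateEntry("OS-2", "normal", "standby", 5),
]


def ratio_for(resource: str, role: str) -> int:
    """Look up the service ratio that table 310 prescribes for a resource and a role."""
    for entry in allocation_table:
        if entry.resource == resource:
            return entry.active_ratio if role == "active" else entry.standby_ratio
    raise KeyError(resource)
```

In this sketch the table 310 holds the policy (which ratio a role receives) and the table 320 holds the current state, which mirrors the separation described for FIG. 3.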

Referring next to the flowchart of FIG. 4, description will be given of operation to update computer resources at a failure of an operating system. In this example, the monitor agents 205 and 206 monitor the computer state table 300 shown in FIG. 3. At a failure of OS-1 in the active system, the cluster services 203 and 204, the multiple operating system management controller 200, and the monitor agents 205 and 206 cooperatively update computer resources.

(1) The monitor agents 205 and 206 periodically communicate respectively with the cluster services 203 and 204 and with the multiple operating system management controller 200. Each of the monitor agents 205 and 206 acquires from either one of the communicating partners information whether or not a failure has occurred in an operating system other than the operating system under which the pertinent monitor agent is running (step 400).

(2) The monitor agent 205 or 206 determines whether or not a failure has occurred in an operating system other than the operating system under which the pertinent monitor agent is running. If no failure is detected, processing is terminated (step 401).

(3) If a failure is detected in OS-1 as an active operating system, the monitor agent 206 running under the standby operating system detects occurrence of a failure in OS-1 in the active system. The monitor agent 206 of the standby OS-2 then accesses the computer resource allocation table 310 to acquire therefrom information indicating how to update computer resources in the pertinent situation (step 402).

(4) In this example, the monitor agent 206 under control of the standby operating system detects a failure in the active operating system. Therefore, at a failure in the active operating system, the standby operating system takes over processing of the active system and then starts its operation as an active operating system. Consequently, the CPU allocation ratio of the standby operating system up to this point is changed to the CPU allocation ratio of the current operating system. The monitor agent 206 notifies the update of computer resources to the multiple operating system management controller 200 which actually manages operation of the CPU (step 403).

(5) After the controller 200 conducts the pertinent processing, the operating system state table 320 is updated. The operation state, the type of system, and the CPU use ratio are updated. After the occurrence of the failure, the active operating system becomes a new standby operating system, the standby operating system becomes a new active operating system, and the CPU use ratio is also accordingly changed to a value in the table 320. The main memory use ratio is also updated (step 404).

(6) As preparation for a case in which the new standby operating system is restarted or returned, current information, namely, information that the restarted operating system operates as a standby system is notified to the monitor agent which will again operate under control of the restarted operating system. The processing is thereby terminated (step 405).
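The role swap of steps 402 to 404 can be sketched as follows. This is a minimal illustration and not the flowchart of FIG. 4 itself: the dictionary layout, the state strings "normal" and "failed", and the function name `fail_over` are assumptions, and the communication with the cluster services and the controller 200 (steps 400, 403, and 405) is omitted.

```python
def fail_over(os_state: dict, allocation: dict, failed: str) -> bool:
    """Swap roles after the active operating system `failed` has failed.

    os_state maps an OS name to {"state", "role", "cpu_ratio"} (a stand-in for table 320);
    allocation maps a resource name to {"active", "standby"} ratios (a stand-in for table 310).
    Returns True when a standby operating system was promoted.
    """
    if os_state[failed]["role"] != "active":
        return False  # only a failure of the active system triggers the swap

    # Find a standby operating system in a normal state to take over (assumed selection rule).
    successor = next(
        (name for name, e in os_state.items()
         if name != failed and e["role"] == "standby" and e["state"] == "normal"),
        None,
    )
    if successor is None:
        return False

    # Step 402: read the ratios prescribed for each role.
    cpu = allocation["CPU"]

    # Step 404: the failed system becomes the new standby, the successor the new active.
    os_state[failed].update(state="failed", role="standby", cpu_ratio=cpu["standby"])
    os_state[successor].update(role="active", cpu_ratio=cpu["active"])
    return True


# Usage with the FIG. 3 values.
allocation = {"CPU": {"active": 95, "standby": 5}}
os_state = {
    "OS-1": {"state": "normal", "role": "active", "cpu_ratio": 95},
    "OS-2": {"state": "normal", "role": "standby", "cpu_ratio": 5},
}
promoted = fail_over(os_state, allocation, "OS-1")
```

After the call, OS-2 holds the active role with the 95% CPU ratio, so the processing capability seen by the applications is unchanged by the failure.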

In the example of FIG. 4, the monitor agents monitor the computer state table 300. However, the table 300 may be managed by the manager terminal 230, not by the monitor agents. In this case, the manager terminal 230 can manage the states of the operating systems in a centralized manner, and hence the processing of step 405 becomes unnecessary. By connecting the client services of the respective cluster services 203 and 204 to the manager terminal 230, the states of the respective clusters can be obtained. In the operation, the manager terminal 230 needs to periodically issue, in the processing of step 400, an inquiry to the cluster services 203 and 204 for the operation states of the respective operating systems. In this example, the standby operating system is basically the same as the active operating system. However, the operating systems may differ from each other. It is only necessary that each operating system can execute processing of the same applications.

In the first embodiment of the present invention, a larger part of the CPU resources is allocated to the active operating system, which has a larger use ratio. At occurrence of a failure in the active operating system, the larger part of the CPU resources can be allocated to the standby operating system which receives processing of the failed operating system and which starts operation as a new active operating system. The computer resources can therefore be more efficiently used.

In the first embodiment of the present invention, the allocation ratio of computer resources to each operating system is determined according to the operation mode of the operating system, namely, the active or standby operating system. When the operation mode is changed, the computer resource allocation ratio is changed. However, in accordance with the present invention, the computer resource allocation ratio may be changed according to the amount of processing load on each operating system or according to a particular time to efficiently use the computer resources. Next, a second embodiment of the present invention will be described.

FIG. 5 shows constitution of tables necessary for the second embodiment. FIG. 6 shows a computer resource update table containing computer resource update information when a high load is imposed onto an operating system. FIG. 7 shows constitution of a computer resource update state table including load of computer resources. FIG. 8 shows in a flowchart a processing procedure for each operating system to update computer resources.

The tables of FIG. 5 necessary for the second embodiment represent overall resources of the computer system and resource allocation to each operating system. The tables are stored in a computer resource configuration area 500. That is, the area 500 includes a CPU use table 510, a main memory use table 520, an external device use table 530, a physical CPU configuration table 540, and an operating system configuration table 550.

The CPU use table 510 includes an item field 511. The number of CPUs in use and the number of CPUs not in use are registered to the fields 511. The table 510 further includes an entry field 512 of a value indicating the number of CPUs. The data items are used to change resources. The main memory use table 520 includes an item field 521. The memory capacity in use and the memory capacity not in use are registered to the fields 521. The table 520 further includes a value field 522. The values are registered to the fields 522 corresponding to the fields 521. The external device use table 530 includes an item field 531. External input/output devices are registered to the fields 531. The table 530 includes an information field 532. Whether or not the external input/output device is in use is registered to each associated field 532. The physical CPU configuration table 540 indicates states of CPUs independently used in the CPU group 110. The physical CPU configuration table 540 includes an item field 541. Identifiers of operating systems to subdivide the CPU group 110 are registered to the fields 541. For each operating system identifier, priority among the operating systems is registered to a priority field 542. An initial number of CPUs resultant from initial subdivision of the CPU group 110 is registered to an initial number field 543. A current number of CPUs actually updated by a monitor agent periodically updating the table 540 is registered to a current number field 544. The priority field 542 contains a priority level which is used to change the resources when the operating system further requests CPUs. The operating system configuration table 550 is a table to which initial states of the respective operating systems are registered. The table includes a name field 551. For each operating system name 551, an execution priority level 552 between the operating systems, a type of operating system 553, an identifier 554 of a physical CPU configuration table, a CPU service ratio 555, a main memory use capacity 556, and external devices in use 556 and 557 are registered to the table 550.

In the table 550 of FIG. 5, the number of the external devices in use is updated according to connections of the computer system 100. The type of operating system 553 is either a transaction type to continuously execute processing or a batch type to execute accumulated processing at a time. In the second embodiment of the present invention, the computer resources are updated in consideration of the difference in characteristics between the processing systems. This leads to more efficient allocation of the computer resources.

The computer resource update table 600 of FIG. 6 contains information to update computer resources when load is imposed on an operating system. An operating system name 601, an update rule identifier 602 for each operating system, and other information items are registered to the table 600 as below. A wait flag 603 is a flag which indicates, when there exists a state in which computer resources cannot be actually updated for the condition of the identifier 602 indicating a rule, whether or not the update of computer resources is to be retried. The wait flag 603 contains a numeric value. Each time an agent which periodically conducts the monitoring operation executes processing of the flowchart shown in FIG. 8, the agent checks the wait flag 603 to determine whether or not computer resources can be updated. The value of the flag 603 indicates the number of checks made by the agent. For the operating system name 601, the monitor agents 205 and 206 check the operating system load according to a computer resource update condition 604. The condition 604 includes threshold data for each computer resource, namely, a CPU 606, a main memory 607, and an external input/output device 608. If the registered conditions are satisfied as a result of the check, the monitor agent 205 or 206 issues a request for configuration according to the values of the CPU 609, the main memory 610, and the external input/output device 611 of a computer resource update configuration 605. If there exists any setting shared between the operating systems, each operating system selects a rule shared therebetween.
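One entry of the computer resource update table 600 and the check against the update condition 604 can be sketched as below. The class name, field names, and all numeric values are hypothetical. The sketch also assumes that a condition is satisfied when every listed resource load reaches its threshold; the description does not state whether all or any of the thresholds must be met.

```python
from dataclasses import dataclass, field


@dataclass
class UpdateRule:
    """One entry of the computer resource update table 600 (names are hypothetical)."""
    os_name: str                 # operating system name 601
    rule_id: int                 # update rule identifier 602
    wait_count: int = 0          # wait flag 603: number of checks made while waiting
    condition: dict = field(default_factory=dict)      # update condition 604 (thresholds 606-608)
    configuration: dict = field(default_factory=dict)  # update configuration 605 (values 609-611)


def rule_applies(rule: UpdateRule, load: dict) -> bool:
    """Return True when every threshold in the condition is reached by the measured load.

    Assumption: all thresholds must be met; a resource missing from `load` counts as 0.
    """
    return all(load.get(res, 0) >= limit for res, limit in rule.condition.items())


# Hypothetical rule: when CPU load reaches 90% and main memory load reaches 80%,
# request 4 CPUs and 2048 MB of main memory for OS-1.
rule = UpdateRule(
    os_name="OS-1",
    rule_id=1,
    condition={"cpu": 90, "main_memory": 80},
    configuration={"cpu": 4, "main_memory": 2048},
)

high = rule_applies(rule, {"cpu": 95, "main_memory": 85})
low = rule_applies(rule, {"cpu": 50, "main_memory": 85})
```

When `rule_applies` returns True, a monitor agent would issue a configuration request using `rule.configuration`, which corresponds to the values 609 to 611.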

The computer resource update state table 700 shown in FIG. 7 contains an operating system name 701 and states of each operating system name 701. The states include load of each computer resource allocated to each operating system. The monitor agent determines, according to the value of load, whether or not computer resources are to be updated. The monitor agent also manages the states of computer resources after the update. The states are managed together with default values to restore the values. For each operating system name 701, current computer resources including a CPU 702, a main memory 704, an external input/output device 706, and an external input/output device 708 are registered. For the respective items, a CPU load 703, a main memory load 705, an external input/output device load 707, and an external input/output device load 709 are stored in the table 700. The states of computer resources are used as criteria to determine whether or not a computer resource update rule is to be used.

Referring now to the flowchart of FIG. 8, description will be given of a processing procedure for each operating system to update computer resources. The processing is executed by the monitor agent in cooperation with the operating system.

(1) In cooperation with each operation system, the monitor agentperiodically checks the operation state of the operating system andstores the operation state in the computer resource update state table700 representing load of each computer resources. The monitor agentchecks whether or not a high load is imposed on the operating systemaccording to information of the computer resource update table 600(steps 800 and 801).

(2) If the load on the operating system is low as a result of the checkin step 801, the monitor agent checks whether or not the resourcesallocated to the operating system match the default values. If thematching results, the computer resources need not be restored to theoriginal state. Therefore, the monitor agent terminates processingwithout any special operation (step 805).

(3) If mismatching results from the check to determine whether or notthe resources allocated to the operating system match the defaultvalues, the monitor agent checks to determine whether or not the stateof the computer resources can be restored to the original state. If therestoration is impossible, the monitor agent terminates processingwithout any special operation (step 806).

(4) If the load on the operating system is high as a result of the checkin step 801, the monitor agent checks to determine whether or not thepriority level of the operating system is “1” in the operating systemconfiguration table 550. If the priority level is other than “1”, themonitor agent acquires a load state of another operating system from thecomputer resource update table 700 and checks to determine whether ornot the load acquired is not high and whether or not the currentcomputer resources can be updated (steps 802 and 803).

(5) If the priority level is “1” as a result of the check in step 802,if the load acquired is not high and the current computer resources canbe updated as a result of the check in step 803, or if the state of thecomputer resources can be restored as a result of the check in step 806,the monitor agent accesses the computer resource update table 600 toacquire therefrom data to change the computer resources (step 807).

(6) If the load acquired is high and the current computer resourcescannot be updated as a result of the check in step 803, the monitoragent checks whether or not the wait flag 603 is set for the pertinentoperating system and another operating system. If the wait flag is setfor either one thereof, the monitor agent updates the wait flag of thepertinent operating system, namely, adds one to the value of the waitflag, and then terminates processing (step 804).

(7) If the wait flag is not set for the operating systems as a result of the check in step 804, the monitor agent accesses the computer resource update table 600 to acquire therefrom data to change the computer resources. The monitor agent then acquires additional computer resources according to the data (step 808).

(8) After step 807 or 808, the monitor agent issues a computer resource update request to the multiple operating system management controller 200, which actually updates the computer resources. After the update of computer resources is finished, the monitor agent updates the computer resource update state table 700 (steps 809 and 810).
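The branching in steps 801 to 808 may be easier to follow as code. The following is a minimal sketch only, not the implementation of the embodiment: the names `OsState`, `Action`, and `decide`, and the boolean `can_update`, are hypothetical stand-ins for the checks made against tables 550, 600, and 700, and the request issued to the controller in steps 809 and 810 is left to the caller.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "terminate without any special operation"
    INCREMENT_WAIT = "add one to the wait flag and terminate"
    UPDATE = "acquire update data and request a resource update"


@dataclass
class OsState:
    # Hypothetical field names; each mirrors one check in the text.
    load_high: bool      # outcome of the load check (step 801)
    priority: int        # priority level in configuration table 550
    at_default: bool     # allocated resources match the default values
    restorable: bool     # resources can be restored to the original state
    wait_flag: int = 0   # wait flag 603


def decide(own: OsState, other: OsState, can_update: bool) -> Action:
    """Return the action the monitor agent takes for the pertinent OS."""
    if not own.load_high:
        # Steps 805 and 806: nothing to do if already at the defaults,
        # or if restoration is impossible.
        if own.at_default or not own.restorable:
            return Action.NONE
        return Action.UPDATE                  # step 807 (restore)
    if own.priority == 1:                     # step 802
        return Action.UPDATE                  # step 807
    if not other.load_high and can_update:    # step 803
        return Action.UPDATE                  # step 807
    if own.wait_flag or other.wait_flag:      # step 804
        own.wait_flag += 1
        return Action.INCREMENT_WAIT
    return Action.UPDATE                      # step 808
```

A caller would issue the update request to the controller whenever `Action.UPDATE` is returned and then record the result in the state table.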

According to the second embodiment of the present invention, the load on the operating system is monitored through the processing above. A computer resource update request is issued according to the state of the operating system load to thereby efficiently use computer resources. The computer resources can therefore be efficiently allocated to the operating systems.

In the second embodiment of the present invention, the computer resources to be allocated to the operating systems are changed according to the states of the operating systems. However, in accordance with the present invention, when it is necessary to manage a processing group of each operating system and to execute the management processing in preference to the processing of the operating systems, a larger part of the computer resources can be allocated to the operating system that is to execute its processing preferentially. In accordance with the present invention, computer resources not allocated to any operating system can also be managed, and the use ratio of each computer resource allocated to the operating systems can be monitored. When the use ratio becomes high for an operating system, the unallocated computer resources can be allocated to that operating system.

By applying the second embodiment of the present invention to allocation of computer resources to a batch processing system, the batch processing can be executed in a concentrated manner and hence the batch processing can be executed with high efficiency. An example of the batch processing will be described.

FIG. 9 shows the constitution of a computer resource update table for a batch processing system. FIG. 10 shows in a flowchart a processing procedure to update computer resources for the batch processing system. In this example, each time batch processing is executed, the computer resources of the batch processing system are updated. When other computer resources are available, the resources are allocated to the batch processing system to execute the batch processing in a concentrated manner.

In FIG. 9, a computer resource update table 900 includes an entry field of a job name 901 of batch processing. For each job name 901, the table 900 includes an operating system name 902 for execution of the job, a job start time 903, a job end time 904, and computer resource update conditions 906 of each computer resource between the job start time and the job end time. The conditions 906 are specified in the same fashion as for the computer resource update table 600. Like the table 600, the table 900 includes a wait flag 905.

Referring next to the flowchart shown in FIG. 10, description will be given of a processing procedure to update computer resources of the batch processing system.

(1) The monitor agent 205 or 206 monitors the job start time in the table 900 to determine whether or not the current time has reached the job start time. If the job start time is not reached, the monitor agent 205 or 206 terminates the processing without conducting any particular operation (steps 1000 and 1001).

(2) If the job start time is reached as a result of the check in step 1001, the monitor agent 205 or 206 checks to determine whether or not the priority level of the operating system of the batch system is “1” as in the processing of FIG. 8. If the priority level is “1”, since computer resources have been sufficiently allocated according to the priority level, the monitor agent 205 or 206 terminates the processing without conducting any particular operation (step 1002).

(3) If the priority level is other than “1” as a result of the check in step 1002, the monitor agent 205 or 206 checks the load of another operating system. If a heavy load is imposed on that operating system, the monitor agent 205 or 206 assumes that allocation of computer resources is impossible and hence terminates the processing without conducting any particular operation (step 1003).

(4) If a heavy load is not imposed on the operating system as a result of the check in step 1003, the monitor agent 205 or 206 checks the wait flag 905 to determine whether or not another operating system is in a wait state after having issued a computer resource update request. If there exists an operating system for which the wait flag has been set, the processing is terminated (step 1004).

(5) If the wait flag has not been set as a result of the check in step 1004, the monitor agent 205 or 206 assumes that computer resources can be updated, acquires an update condition from the computer resource update table 900, and then notifies the multiple operating system management controller 200, which actually updates the computer resources, of the condition. After the resource update is finished, the monitor agent 205 or 206 writes the associated state in the computer resource update state table 700 and then terminates the processing (steps 1005 to 1007).

After the processing shown in FIG. 10, the batch processing system executes batch processing by use of the updated computer resources. After the batch processing is completed or when the job end time 904 of the computer resource update table 900 is reached, the monitor agent 205 or 206 restores the state of the computer resources to the state before the update.
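The checks of FIG. 10 and the restoration rule described above can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the procedure of the embodiment: `BatchJob`, `should_update`, and `should_restore` are hypothetical names, the fields mirror entries 901 to 906 of the table 900, and the load state of the other operating system is passed in as a plain boolean.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class BatchJob:
    # One row of the computer resource update table 900 (names hypothetical).
    job_name: str                 # 901
    os_name: str                  # 902
    start: datetime               # 903
    end: datetime                 # 904
    wait_flag: int = 0            # 905
    conditions: dict = field(default_factory=dict)  # 906


def should_update(job, table, now, priority, other_load_high):
    """Return True when the monitor agent may request a resource update."""
    if now < job.start:           # step 1001: start time not reached
        return False
    if priority == 1:             # step 1002: already sufficiently allocated
        return False
    if other_load_high:           # step 1003: other OS is heavily loaded
        return False
    # Step 1004: some other operating system is already waiting.
    if any(j.wait_flag for j in table if j.os_name != job.os_name):
        return False
    return True                   # steps 1005 to 1007 follow


def should_restore(job, now, finished):
    """Restore when the batch is completed or the job end time is reached."""
    return finished or now >= job.end
```

When `should_update` returns True, the caller would read `job.conditions` and pass them to the controller, then record the new state.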

As above, the monitor agents manage the states of computers to appropriately update the computer resources. Therefore, the computer resources can be efficiently allocated to a plurality of operating systems.

In general, when a computer of a multiprocessor environment is used as a server machine, a plurality of users access the server machine. In this situation, to guarantee processing performance of the computer accessed, the computer resources are subdivided to be allocated to the respective users. The system to update the computer resources according to the present invention can be applied to the server machine, and an accounting operation or a charging operation can be conducted according to the states of update of the computer resources. By the accounting method, the users can flexibly use the server machine. A third embodiment of the present invention will be described next. In the third embodiment, the computer resource update technique is applied to an accounting method.

FIG. 11 shows a configuration of a user configuration area 1100 including a user account table 1110 and a user register table 1120 necessary for the third embodiment. FIG. 12 shows a configuration of a computer resource account standard area 1200 including account standard tables necessary for account calculation. FIG. 13 shows in a flowchart a processing procedure for the accounting operation according to use of computer resources.

The user configuration area 1100 includes the user account table 1110 and the user register table 1120. The table 1110 is updated according to a request from each user to update computer resources. The table 1110 includes a user identifier field 1111. For each user identifier 1111, the table 1110 includes a time charged field 1112, CPU charge state fields 1113 and 1114, main memory charge state fields 1115 and 1116, external input/output device charge state fields 1117 and 1118, and a total charge state field 1119. Appropriate values are set to the respective fields or the contents thereof are appropriately updated. The register table 1120 stores data initially registered by users and includes a user identifier field 1121. For each user identifier 1121, the table 1120 includes a start time field 1122, an end time field 1123, a CPU registration field 1124, a main memory registration field 1125, an external input/output device registration field 1126, and a standard account field 1127. The fields up to the field 1126 are initialized to the respective initial values. According to the values initialized, a standard account value is calculated and set to the standard account field 1127.

In the third embodiment of the present invention, the accounting operation is carried out according to information set to the user account table 1110 and the user register table 1120 shown in FIG. 11. In the tables, a circle indicates that the pertinent item incurs no charge beyond the standard charge. For example, when user 1 uses one CPU, the user can conduct operation without any additional charge. When the user uses computer resources exceeding the standard computer resources, the user is charged for the resources used and the charge is added to the standard charge.

The computer resource account standard area 1200 of FIG. 12 includes tables to which account items are set according to respective conditions to conduct the accounting operation. The tables are a CPU account standard table 1210, a main memory account standard table 1230, and an external device account standard table 1250. Standard values for the accounting operation are set to the tables. Each table includes items of account standard values determined for each computer resource according to a use time and a use amount of the pertinent resource.

Referring now to the flowchart of FIG. 13, description will be given of an accounting procedure according to use of computer resources.

(1) The system confirms operation or use of the system by a user. The system acquires a condition from the user, namely, an execution request to execute processing specified by the user. The system then checks whether or not a request to update computer resources is present. If the request is absent, the computer resources are not updated and control goes to processing of a request from, for example, another user (steps 1300 to 1302).

(2) When a user request for resource update is present as a result of the check in step 1302, a processing request is issued to the multiple operating system management controller 200 according to the condition of the request. After the computer resources are updated, the computer resource update state table 700 is updated, and an associated condition is registered to an associated account table (steps 1303 and 1304).

(3) After the account condition is updated, the state of the computer resources is restored to the original state, a use time and account information are stored in the user account table 1110, and the processing is terminated (step 1305).
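The charging rule stated for the third embodiment, a standard charge plus a charge for resources used beyond the registered amounts, can be illustrated with a small calculation. The sketch below is an assumption-laden example, not the accounting method of the embodiment: the rates `CPU_RATE` and `MEM_RATE`, the per-hour pricing model, and the names `Registration` and `total_charge` are all hypothetical, since the actual standard values reside in the tables 1210, 1230, and 1250 and are not reproduced here. External input/output devices are omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class Registration:
    # One row of the user register table 1120 (field names hypothetical).
    cpus: int               # CPU registration field 1124
    memory_mb: int          # main memory registration field 1125
    standard_charge: float  # standard account field 1127


# Hypothetical rates per extra unit per hour of use.
CPU_RATE = 10.0
MEM_RATE = 0.5


def total_charge(reg: Registration, used_cpus: int, used_memory_mb: int,
                 hours: float) -> float:
    """Standard charge plus a charge for use beyond the registration."""
    extra_cpus = max(0, used_cpus - reg.cpus)
    extra_mem = max(0, used_memory_mb - reg.memory_mb)
    extra = (extra_cpus * CPU_RATE + extra_mem * MEM_RATE) * hours
    return reg.standard_charge + extra
```

With a registration of one CPU, a user who runs on one CPU pays only the standard charge, while a user who temporarily runs on three CPUs is additionally charged for the two extra CPUs over the period of use.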

According to the third embodiment of the present invention, the user can flexibly update the computer environment. Only when necessary, a high-performance computer system can be supplied. When unnecessary, a computer system having minimum performance is supplied. Therefore, the user can use the computer system at a flexible cost.

FIG. 14 shows in a block diagram a schematic configuration of the fourth embodiment of a computer system in which operating systems are arranged in a cluster layout according to the present invention. FIG. 15 shows in a flowchart a transmitting procedure to communicate with a shared memory. FIG. 16 shows in a flowchart a receiving procedure to communicate with a shared memory. The fourth embodiment will be described next.

In the fourth embodiment, the multiple operating system management controller 200 includes a shared memory 1400 shared between the operating systems. In this configuration, data in a format for communication is generated in the shared memory 1400. The system conducts read and write operations to communicate the data in a virtual communication network without using any communication cable. The fourth embodiment of FIG. 14 is implemented by adding the shared memory 1400 to the configuration shown in FIG. 2, specifically, to the multiple operating system management controller 200. The shared memory communication may be used as an internal communication when operating systems are configured in a cluster layout. In the fourth embodiment, the cable 212 of the example of FIG. 2 can be dispensed with, and it is possible to construct an inner communication network. The shared memory communication is accomplished by shared memory communication drivers 1401 and 1402 arranged in the respective operating system areas.

The transmitting procedure of the shared memory communication is carried out according to the flowchart shown in FIG. 15 as below.

(1) The shared memory communication driver 1401 or 1402 converts, on receiving a transmission request from a program under control of an operating system, transmission data associated with the transmission request into a communication format. When it is assumed, for example, that Transmission Control Protocol/Internet Protocol (TCP/IP) over Ethernet is used for communication, the transmission data is converted into the Ethernet format (steps 1500 and 1501).

(2) A write request is issued to the multiple operating system management controller 200 to write the data (step 1502).

(3) The controller 200 checks to determine whether the shared memory 1400 includes an empty area. If no empty area is found, processing is suspended for the time being and is set to a wait state to await an empty area in the shared memory 1400 (step 1503).

(4) If an empty area is found as a result of the check in step 1503, the controller 200 writes the transmission data in the area of the shared memory 1400 (step 1504).

Since the multiple operating system management controller 200 actually writes the transmission data in the shared memory 1400, the communication driver 1401 or 1402 only transfers the write data to the controller 200.
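The division of work in steps 1500 to 1504, where the driver formats the data and the controller performs the actual write, can be sketched as follows. This is a minimal illustration under stated assumptions: the four-byte header (destination, source, payload length) is a hypothetical stand-in for a real communication format such as an Ethernet frame, the shared memory is modelled as a fixed number of slots, and the names `SharedMemory`, `to_frame`, and `transmit` do not appear in the embodiment.

```python
import struct

# Hypothetical header: destination id, source id, payload length.
HEADER = struct.Struct("!BBH")


def to_frame(dst: int, src: int, payload: bytes) -> bytes:
    """Driver side: convert transmission data into a communication format."""
    return HEADER.pack(dst, src, len(payload)) + payload


class SharedMemory:
    """Controller side: shared memory modelled as a list of slots."""

    def __init__(self, slots: int):
        self.slots = [None] * slots

    def write(self, frame: bytes) -> bool:
        # Step 1503: look for an empty area.
        for i, slot in enumerate(self.slots):
            if slot is None:
                self.slots[i] = frame   # step 1504: write the data
                return True
        return False                    # no empty area: caller must wait


def transmit(shm: SharedMemory, dst: int, src: int, payload: bytes) -> bool:
    frame = to_frame(dst, src, payload)  # step 1501
    return shm.write(frame)              # steps 1502 to 1504
```

A return value of False corresponds to the wait state of step 1503; a caller would retry once an area has been freed.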

The receiving procedure of the shared memory communication is accomplished according to the flowchart shown in FIG. 16 as follows. The operating system may issue a reception request to the shared memory communication driver at a fixed interval of time and receive reception data only if the reception data is present. Alternatively, the shared memory communication driver on the transmission side may notify the reception side of the data transmission in a predetermined procedure.

(1) Having received a reception request from a program under control of an operating system, the shared memory communication driver 1401 or 1402 determines presence or absence of reception data associated with the reception request. Specifically, a check is made to determine whether or not reception data has been written in the shared memory 1400. If no reception data is found, processing is terminated (steps 1600 and 1601).

(2) If reception data is found in the shared memory 1400 as a result of the check in step 1601, the reception data is read therefrom, the communication format of the reception data is analyzed, and the actual data part of the reception data is thereby obtained (steps 1602 and 1603).
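The receiving side of steps 1600 to 1603 can be sketched in the same spirit. The sketch below rests on assumptions not stated in the text: the four-byte header (destination, source, payload length) is a hypothetical format, the shared memory is modelled as a list of slots, frames are selected by a destination identifier, and an area is freed once its data has been read. The name `receive` is likewise illustrative.

```python
import struct

# Hypothetical header: destination id, source id, payload length.
HEADER = struct.Struct("!BBH")


def receive(slots: list, my_id: int):
    """Return the payload of the first frame addressed to my_id, or None."""
    for i, frame in enumerate(slots):
        if frame is None:
            continue                     # empty area
        dst, _src, length = HEADER.unpack_from(frame)
        if dst != my_id:
            continue                     # addressed to another OS
        slots[i] = None                  # assumption: free the area after reading
        # Step 1603: strip the format and keep only the data part.
        return frame[HEADER.size:HEADER.size + length]
    return None                          # step 1601: no reception data
```

Polling at a fixed interval, as described for the receiving procedure, would amount to calling `receive` repeatedly and ignoring None results.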

In the example, the communication method using the shared memory makes it possible to implement a virtual communication cable using the shared memory without any actual communication cable. This method is also used for the network communication in the cluster layout. If the memory includes a nonvolatile memory area, the nonvolatile memory area may be treated as a virtual external storage. By accessing the virtual external storage via Small Computer System Interface (SCSI), a virtual SCSI communication can be achieved.

According to the present invention, in a computer system in which a plurality of operating systems run on one computer, the operating systems are configured in a cluster layout in an environment in which computer resources can be updated and reallocated for the operating systems. The state of operation of each operating system is monitored. At occurrence of a failure in an active operating system, a larger part of the computer resources is allocated to another operating system which is in a normal state and which becomes a new active operating system. As a result, regardless of the failure, the computer system can be operated without changing the processing capability thereof.

According to the present invention, the load of operating systems can be monitored to appropriately allocate computer resources to the operating systems according to the load. Therefore, the processing capability can be appropriately adjusted between the operating systems.

The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention as set forth in the claims.

1. A computer system, comprising: means for operating a plurality of operating systems on one computer and allocating computer resources to the operating systems; means for managing the computer resources; means for updating allocation of the computer resources to the operating systems and restoring the allocation thereof; means for managing contents respectively of the update of the computer resource allocation and the restoration of the computer resource allocation in relation to a state of operation of each of the operating systems; and means for updating the computer resource allocation and restoring the computer resource allocation according to a state of operation of each of the operating systems. 2-9. (canceled)